Diamonds in the Rough: Event Extraction from Imperfect Microblog Data

Authors

  • Ander Intxaurrondo
  • Eneko Agirre
  • Oier Lopez de Lacalle
  • Mihai Surdeanu
Abstract

We introduce a distantly supervised event extraction approach that extracts complex event templates from microblogs. We show that this near real-time data source is more challenging than news because it contains information that is both approximate (e.g., with values that are close to, but different from, the gold truth) and ambiguous (due to the brevity of the texts), which affects both the evaluation and the extraction methods. For evaluation, we propose a novel "soft" F1 metric that incorporates the similarity between extracted fillers and the gold truth, giving partial credit to different but similar values. For extraction, we propose two extensions to the distant supervision paradigm: to address approximate information, we allow positive training examples to be generated from information that is similar but not identical to gold values; to address ambiguity, we aggregate contexts across tweets discussing the same event. We evaluate our contributions on the complex domain of earthquakes, where events have up to 20 arguments. Our results indicate that, despite their simplicity, our contributions yield a statistically significant improvement of 33% (relative) over a strong distantly supervised system. The dataset containing the knowledge base, relevant tweets, and manual annotations is publicly available.
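
As an illustration of the "soft" F1 idea described in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes numeric slot values and a hypothetical similarity function based on relative error; the abstract does not specify the actual similarity measure, and the names `numeric_similarity` and `soft_f1` are introduced here for illustration only.

```python
from typing import Dict


def numeric_similarity(predicted: float, gold: float) -> float:
    """Hypothetical similarity in [0, 1] (an assumption, not from the paper).

    Returns 1.0 for an exact match and decays linearly with relative error.
    """
    if predicted == gold:
        return 1.0
    # The two values differ, so at least one is non-zero and denom > 0.
    denom = max(abs(predicted), abs(gold))
    return max(0.0, 1.0 - abs(predicted - gold) / denom)


def soft_f1(predicted: Dict[str, float], gold: Dict[str, float]) -> float:
    """F1 in which each extracted slot earns partial credit for a near-miss.

    `predicted` and `gold` map slot names (e.g. "magnitude") to values.
    A standard F1 would award credit 1 only for exact matches; here the
    credit per slot is the similarity between predicted and gold values.
    """
    credit = sum(
        numeric_similarity(value, gold[slot])
        for slot, value in predicted.items()
        if slot in gold
    )
    precision = credit / len(predicted) if predicted else 0.0
    recall = credit / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Illustrative values only, not taken from the paper's dataset.
    gold_event = {"magnitude": 6.0, "depth_km": 10.0}
    tweet_event = {"magnitude": 5.7, "depth_km": 10.0}
    print(soft_f1(tweet_event, gold_event))
```

Under this sketch, a tweet reporting a magnitude of 5.7 against a gold value of 6.0 still earns most of the credit for that slot, whereas an exact-match F1 would count it as both a false positive and a false negative.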

Similar resources

News Feature Extraction for Events on Social Network Platforms

Microblog-based social network platforms like Twitter and Sina Weibo have been important sources for news event extraction. However, existing works on microblog event extraction, which usually use keywords, entities, or selected microblogs to represent events, are not able to extract details of an event. Based on the view of news report, an event should present detailed news features, i.e., whe...

Overview of the FIRE 2016 Microblog track: Information Extraction from Microblogs Posted during Disasters

The FIRE 2016 Microblog track focused on retrieval of microblogs (tweets posted on Twitter) during disaster events. A collection of about 50,000 microblogs posted during a recent disaster event was made available to the participants, along with a set of seven practical information needs during a disaster situation. The task was to retrieve microblogs relevant to these needs. 10 teams participat...

Exploiting Community Emotion for Microblog Event Detection

Microblogging has become a major platform for information about real-world events. Automatically discovering real-world events from microblogs has attracted the attention of many researchers. However, most existing work ignores the importance of emotion information for event detection. We argue that people's emotional reactions immediately reflect the occurrence of real-world events and should be im...

Microblog Track 2011 of FDU

Twitter provides a huge amount of short messages, which raises challenging problems for the research community. The Microblog Track of TREC detects the special behavior of the Twitter dataset in the "real-time" retrieval task. This paper reports our participation in the Microblog Track task. Given the query topics, each participant is required to conduct a "real-time" retrieval task, which seeks for the...

Restoring the past glory of Diamond Mining in south India- A plausible case of diamondiferous Wajrakarur kimberlite pipe clusters with geochemical evidences

A plausible case of collective and economical mining of diamondiferous kimberlite deposits of Wajrakarur and adjoining places in Andhra Pradesh, southern India along with the whole-rock geochemical evidences in support of their diamond potentiality are discussed in this article. The kimberlites/lamproites are mantle-derived ultrabasic rocks which rarely carry diamonds from mantle to the earth’s...

Publication date: 2015